nn algorithm
We thank all reviewers for their valuable comments. Hereafter, we address comments shared by several reviewers. However, our general Theorem 3.2 asks for the continuity of the cumulative distribution function … In particular, Section 4.2 presents a randomized … This randomization technique used to circumvent Assumption 3.1 is rather simple … Besides, we mention that Assumption 3.1 is one of the limitations … The obtained rates of convergence rely on Proposition C.1 (given in the supplementary material), Section 4. … That is to say, the rate of convergence given in Theorem 4.4 applies to estimators which can be written as … Theorem 4.4 shows, from a theoretical perspective, the dependency with respect to the sample size of labeled and …
Efficient Quantum Approximate $k$NN Algorithm via Granular-Ball Computing
Xia, Shuyin, Tian, Xiaojiang, Yuan, Suzhen, Deng, Jeremiah D.
High time complexity is one of the biggest challenges faced by $k$-Nearest Neighbors ($k$NN). Although current classical and quantum $k$NN algorithms have made some improvements, they still face a speed bottleneck on large amounts of data. To address this issue, we propose an algorithm called Granular-Ball based Quantum $k$NN (GB-Q$k$NN). This approach achieves higher efficiency by first employing granular-balls, which reduce the amount of data that needs to be processed. The search process is then accelerated by adopting a Hierarchical Navigable Small World (HNSW) method. Moreover, we optimize the time-consuming steps of the HNSW, such as distance calculation, via quantization, further reducing the time complexity of the construction and search processes. By combining granular-balls with quantization of the HNSW method, our approach significantly reduces the time complexity of $k$NN-like algorithms, as revealed by a comprehensive complexity analysis.
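As a rough illustration of the data-reduction step only, the following classical sketch replaces labeled points with a few balls (center, radius, label) and classifies a query by its nearest ball. It is not the GB-Q$k$NN algorithm: the HNSW index and the quantum steps are omitted, and the splitting rule, the purity threshold, and all function names (`build_balls`, `predict`, `center_and_radius`) are assumptions made for this example.

```python
import math
from collections import Counter

def center_and_radius(points):
    """Mean of the points and the largest distance from that mean."""
    dim = len(points[0])
    center = tuple(sum(p[i] for p in points) / len(points) for i in range(dim))
    radius = max(math.dist(center, p) for p in points)
    return center, radius

def build_balls(points, labels, min_purity=1.0):
    """Recursively split the data until each ball is pure enough (assumed rule)."""
    counts = Counter(labels)
    majority, n_major = counts.most_common(1)[0]
    if n_major / len(labels) >= min_purity:
        center, radius = center_and_radius(points)
        return [(center, radius, majority)]
    # Seed the split with one point from each of the two most common classes.
    (label_a, _), (label_b, _) = counts.most_common(2)
    seed_a = points[labels.index(label_a)]
    seed_b = points[labels.index(label_b)]
    left, right = [], []
    for p, y in zip(points, labels):
        side = left if math.dist(p, seed_a) <= math.dist(p, seed_b) else right
        side.append((p, y))
    if not left or not right:
        # Degenerate split (e.g. duplicate points): keep one impure ball.
        center, radius = center_and_radius(points)
        return [(center, radius, majority)]
    balls = []
    for part in (left, right):
        balls.extend(build_balls([p for p, _ in part], [y for _, y in part], min_purity))
    return balls

def predict(balls, query):
    """Label of the ball whose surface is closest to the query."""
    _, _, label = min(balls, key=lambda b: math.dist(query, b[0]) - b[1])
    return label

points = [(0, 0), (0, 1), (1, 0), (1, 1), (5, 5), (5, 6), (6, 5), (6, 6)]
labels = [0, 0, 0, 0, 1, 1, 1, 1]
balls = build_balls(points, labels)
print(len(points), "points reduced to", len(balls), "balls")
print(predict(balls, (1, 2)), predict(balls, (5, 4)))
```

In this toy data set, a query is compared against two balls instead of eight points; that reduction in search candidates is the effect the abstract attributes to granular-balls.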
Learning the Exact Time Integration Algorithm for Initial Value Problems by Randomized Neural Networks
We present a method leveraging extreme learning machine (ELM) type randomized neural networks (NNs) for learning the exact time integration algorithm for initial value problems (IVPs). The exact time integration algorithm for non-autonomous systems can be represented by an algorithmic function in higher dimensions, which satisfies an associated system of partial differential equations with corresponding boundary conditions. Our method learns the algorithmic function by solving this associated system using ELM with a physics-informed approach. The trained ELM network serves as the learned algorithm and can be used to solve the IVP with arbitrary initial data or step sizes from some domain. When the right-hand side of the non-autonomous system is periodic with respect to any of its arguments, while the solution to the problem is not itself periodic, we show that the algorithmic function is either periodic or, when it is not, satisfies a well-defined relation for different periods. This property can greatly simplify the algorithm learning in many problems. We consider explicit and implicit NN formulations, leading to explicit or implicit time integration algorithms, and discuss how to train the ELM network by the nonlinear least squares method. Extensive numerical experiments with benchmark problems, including non-stiff, stiff and chaotic systems, show that the learned NN algorithm produces highly accurate solutions in long-time simulations, with its time-marching errors decreasing nearly exponentially with increasing degrees of freedom in the neural network. We extensively compare the computational performance (accuracy vs. cost) of the current NN algorithm with that of the leading traditional time integration algorithms. The learned NN algorithm is computationally competitive, markedly outperforming the traditional algorithms in many problems.
High-rate discretely-modulated continuous-variable quantum key distribution using quantum machine learning
Liao, Qin, Liu, Jieyu, Huang, Anqi, Huang, Lei, Fei, Zhuoying, Fu, Xiquan
Continuous-variable quantum key distribution (CVQKD) [1] is designed to implement point-to-point secret key distribution; its security is guaranteed by the fundamental laws of quantum physics [2]. In a basic version of CVQKD [3], the sender, called Alice, encodes secret key bits in the phase space of coherent states and sends them through an insecure quantum channel, while the receiver, called Bob, measures the incoming signal states with coherent detection. After several steps of data post-processing, a string of secret keys is finally shared by Alice and Bob. One of the advantages of CVQKD is that it is compatible with most existing commercial telecommunication technologies [4-6], making it easier to integrate into real-world communication links. In general, a CVQKD system is mainly composed of quantum signal processing and data post-processing [7]. The former part corresponds to signal modulation, transmission, and measurement, aiming to generate a raw key, while the latter part corresponds to data reconciliation, parameter estimation, and privacy amplification, attempting to extract the final secret key from the raw key. In CVQKD, the secret key rate and the maximal transmission distance are a pair of crucial performance indicators. For a specific CVQKD system, however, there is a tradeoff between the two: the longer the transmission distance, the lower the secret key rate, and vice versa. The main reason is that the continuous-variable quantum signal used to carry the secret key is extremely weak.
An Adjusted Nearest Neighbor Algorithm Maximizing the F-Measure from Imbalanced Data
Viola, Rémi, Emonet, Rémi, Habrard, Amaury, Metzler, Guillaume, Riou, Sébastien, Sebban, Marc
In this paper, we address the challenging problem of learning from imbalanced data using a Nearest-Neighbor (NN) algorithm. In this setting, the minority examples typically belong to the class of interest, which requires the optimization of specific criteria such as the F-Measure. Based on simple geometrical ideas, we introduce an algorithm that reweights the distance between a query sample and any positive training example. This leads to a modification of the Voronoi regions and thus of the decision boundaries of the NN algorithm. We provide a theoretical justification for the weighting scheme needed to reduce the False Negative rate while controlling the number of False Positives. We perform an extensive experimental study on many public imbalanced datasets, and also on large-scale non-public data from the French Ministry of Economy and Finance on a tax fraud detection task, showing that our method is very effective and, interestingly, yields the best performance when combined with state-of-the-art sampling methods.
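The reweighting idea can be sketched as follows, assuming the simplest possible form: the distance from the query to every positive training example is multiplied by a single factor `gamma`, so that `gamma < 1` enlarges the regions attributed to positives. The factor name, the function `gamma_nn_predict`, and the toy data are assumptions made for this example; the abstract does not specify the exact weighting scheme.

```python
import math

def gamma_nn_predict(train, query, gamma=1.0, k=1):
    """k-NN vote in which distances to positive examples are scaled by gamma.

    train: list of (point, label) pairs, label 1 = positive (minority),
           label 0 = negative. gamma is a hypothetical weighting factor.
    """
    scored = []
    for point, label in train:
        d = math.dist(point, query)
        if label == 1:
            d *= gamma  # gamma < 1 makes positives look closer to the query
        scored.append((d, label))
    scored.sort(key=lambda t: t[0])
    positive_votes = sum(label for _, label in scored[:k])
    return 1 if 2 * positive_votes > k else 0

# Three negatives and a single positive: an imbalanced toy set.
train = [((0, 0), 0), ((1, 0), 0), ((0, 1), 0), ((3, 3), 1)]
query = (1.5, 1.5)

plain = gamma_nn_predict(train, query, gamma=1.0)     # standard 1-NN
adjusted = gamma_nn_predict(train, query, gamma=0.5)  # positives favored
print(plain, adjusted)
```

With `gamma=1.0` the query falls in a negative Voronoi cell; with `gamma=0.5` the positive example's cell grows enough to capture it. This trades more potential False Positives for fewer False Negatives, which is why the choice of weight needs the kind of justification the abstract mentions.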
Two Major Difficulties in AI and One Applied Solution
This is the heyday of AI. New and exciting applications are found on an almost daily basis, proving that the promise of AI was not in vain. However, this success comes at a price, and I would like to highlight two such prices and one solution on the path to a remedy. For the past two years, AI has suffered from a growing problem of trust. The lack of transparency, fairness, and safety[i] in ML and NN algorithms (AKA the black box), together with their weakness in accounting for specific situations, impedes the desired adoption rate of AI algorithms and creates frustration in different domains of modern life.
A Graph-Based Semi-Supervised k Nearest-Neighbor Method for Nonlinear Manifold Distributed Data Classification
Tu, Enmei, Zhang, Yaqian, Zhu, Lin, Yang, Jie, Kasabov, Nikola
$k$ Nearest Neighbors ($k$NN) is one of the most widely used supervised learning algorithms for classifying Gaussian distributed data, but it does not achieve good results when applied to nonlinear manifold distributed data, especially when only a very limited number of labeled samples is available. In this paper, we propose a new graph-based $k$NN algorithm which can effectively handle both Gaussian distributed data and nonlinear manifold distributed data. To achieve this goal, we first propose a constrained Tired Random Walk (TRW) by constructing an $R$-level nearest-neighbor strengthened tree over the graph, and then compute a TRW matrix for similarity measurement purposes. After this, the nearest neighbors are identified according to the TRW matrix, and the class label of a query point is determined by the sum of all the TRW weights of its nearest neighbors. To deal with online situations, we also propose a new algorithm to handle sequential samples based on local neighborhood reconstruction. Comparison experiments are conducted on both synthetic and real-world data sets to demonstrate the validity of the proposed $k$NN algorithm and its improvements over other versions of $k$NN algorithms. Given the widespread appearance of manifold structures in real-world problems and the popularity of the traditional $k$NN algorithm, the proposed manifold version of $k$NN shows promising potential for classifying manifold-distributed data.